Text File | 1991-02-24 | 129.3 KB | 2,930 lines |
- OCRSHR22.ZIP
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- OCRSHARE Release 2.2
- Shareware User's Manual
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- Solution Technology Inc.
- 1101 S. Rogers Circle, Bldg. 14
- Boca Raton, Florida 33487
- Tel. (407) 241-3210
- Fax. (407) 241-3251
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- Copr (c) 1984-1990 Solution Technology, Inc. page 1
- OCRSHR22.ZIP
-
-
- Table of Contents
- ==================
- Front Matter
- Copyright Acknowledgements....................... 1
- License.......................................... 1
- Bulletin Boards.................................. 2
- Public Domain and Shareware Libraries............ 3
- Association of Shareware Professionals (ASP)..... 4
- Registration..................................... 4
-
- Introduction......................................... 5
- Why OCR Software?................................. 5
- Features.......................................... 6
- OCRSHARE...................................... 6
- ATXSHARE...................................... 6
- Advantex OCR.................................. 6
- Ver 2.2 Enhancements.............................. 7
- How to Use This Manual........................... 7
- Tutorial...................................... 7
- Trouble Shooting.............................. 7
-
- Installation......................................... 8
- Computer Requirements............................. 8
- Running from a Hard Disk.......................... 8
- Notation.......................................... 8
- Hard Disk Installation............................ 9
- Modifying CONFIG.SYS.............................. 9
- Files & Buffers................................... 9
- Starting OCRSHARE................................. 10
- Forcing a Display Adaptor......................... 10
- Screen Selections................................. 11
- Mice.............................................. 11
- Networks.......................................... 12
- EMS Memory........................................ 12
-
- Tutorial............................................. 15
- Starting OCRSHARE................................. 15
- Moving Around..................................... 15
- The Six Basic Keyboard Keys................... 15
- Short Cut Keys................................ 16
- Exiting OCRSHARE.................................. 16
- Perform OCR on an Image File...................... 17
- Load a scanned image file..................... 17
- Exploring the loaded image.................... 17
- Load an OCR Recognition Font.................. 17
- Setup the Text Output File.................... 17
- OCR the full page............................. 17
- Selecting areas to OCR or Cutout.............. 18
- Canceling a selected area..................... 18
- OCR of selected area.......................... 18
- Unloading an OCR Font......................... 18
- Saving a selected area........................ 18
- Training a New OCR Font....................... 19
- Saving the New OCR Font....................... 19
- More about OCRSHARE's File Finders................ 19
- Editing the File Name......................... 20
- Over typing the File Name..................... 21
- Incrementing Numbered File Names.............. 21
- Graphics Editing.................................. 21
- Invert Page................................... 22
- Flip Page Vertical............................ 22
- Erase Inside.................................. 22
- Erase Outside................................. 22
-
- OCR Notes............................................ 24
- Looking at the output text file................... 24
- Additional Information on Training................ 24
- Naming the ink blots during training.......... 24
- Proper training and symbol diversity.......... 25
- Making an Alphabet Page....................... 25
- Point sizes and Scanning DPI.................. 25
- About Autoskip and Manual Training Modes...... 26
- More on Autoskip Training..................... 26
- About the Interactive Training screen......... 26
- Training Your Own Fonts for Recognition....... 27
- Using Your New OCR Font....................... 27
- Multi font Capability......................... 27
- Handling Special OCR Problems..................... 30
- Broken Letters................................ 30
- Broken Dot Matrix............................. 30
- Run together Letters.......................... 30
- Big Dirt Spots................................ 30
- Underscores................................... 31
- The % symbol.................................. 31
- Italics....................................... 31
- Advanced Training Methods......................... 31
- Training Foreign Characters................... 31
- Training from FRENCH.ATX...................... 32
- Training Foreign Language Letters............. 32
- Font Editing for correct OCR output........... 32
- Using your new French OCR font................ 33
- Training Special Symbols.......................... 33
- Ocr Tuneup Fonts.................................. 33
- Creating a tuneup font........................ 34
- Derivative Fonts.................................. 34
- Testing an Existing Font...................... 34
- Editing the font.............................. 35
- Retraining your font.......................... 35
-
- Trouble Shooting..................................... 36
- OCRSHARE V2.2 Registration Form...................... 44
- Note To Retail Dealers............................... 45
- Note To Shareware Dealers............................ 45
-
- Copyright Acknowledgements
- ==========================
-
- MS-DOS is a registered trademark of Microsoft Corporation. IBM,
- PC-XT and PC-AT are registered trademarks of International
- Business Machines Corporation. PC Paintbrush, PC Paintbrush +,
- and Publisher's Paintbrush are registered trademarks of ZSoft
- Corporation. PageMaker is a registered trademark of Aldus
- Corporation. Ventura Publisher is a registered trademark of
- Xerox Corporation. H.P. is a registered trademark of
- Hewlett-Packard Company.
-
- License
- =======
-
- OCRSHARE (c) 1986, 1987, 1988, 1989, 1990 is a copyrighted
- software product of Solution Technology, Inc. (STI), Boca Raton,
- Florida. OCRSHARE is being distributed under the shareware
- distribution process and is intended for the personal use and
- enjoyment of the recipient.
-
- OCRSHARE is copyrighted and has been released for distribution as
- SHAREWARE. Please note that a great deal of effort and time has
- been invested in the development of this program. You are granted
- a license to try OCRSHARE for a reasonable trial period without
- risk.
-
-
- 1. No organization, individual or other entity may reproduce,
- print, duplicate, copy, or distribute the program, manual
- or any ancillary file herein for any commercial purpose or
- personal gain or commercial gain whatsoever without the
- express written permission of Solution Technology, Inc.
-
- 2. STI hereby grants the user a single-user license to the
- programs, documents and ancillary files herein for the
- user's private use and enjoyment.
-
- 2. The user is additionally granted the right to and encouraged
- to:
-
-
- a. print a hard copy version of this manual for his own
- use.
- b. give, copy, upload and otherwise distribute WITHOUT
- CHARGE the complete and full and true copy of the
- OCRSHRxx.ZIP file.
-
-
- 3. STI makes no warranty, either express or implied, with
- respect to the shareware software described herein, its
- quality, performance, marketability, or fitness for any
- particular purpose.
-
- 4. The user acknowledges that STI is not liable for any
- damages, either consequential or direct, arising out of the
- use of this Software. The user assumes complete responsibility
- for any decisions made or actions taken based on information
- obtained or distributed through use of this software product
- and any accompanying instructional or reference materials
- provided by Solution Technology, Inc.
-
- Bulletin Board License
- ----------------------
-
- 1. Operators of electronic bulletin boards (Sysops) are
- hereby licensed and encouraged to post OCRSHARE for
- downloading by their users subject to the following
- provisions. In addition, sysops are encouraged to keep
- the uploaded .ZIP file name OCRSHRxx.ZIP, where xx
- represents this version of OCRSHARE (e.g., OCRSHR22.ZIP),
- so your users will know if they have the latest version.
-
- 2. OCRSHARE may be uploaded to and downloaded from
- commercial systems such as CompuServe, the Source, and
- BIX, so long as the only charge paid by the subscriber
- is for on-line time and there is no charge for the
- program. Those copying, sharing, and/or electronically
- transmitting the program are required not to delete or
- modify the copyright notice and restrictive notices
- from the program or documentation; anyone doing so will
- be treated as a contributory copyright violator.
-
- 3. If you, as a BBS sysop or user, are passing this
- program on to others, uploading it to a bulletin board
- system, or including it in a users group library, do
- not separate the files contained in the distribution
- archive - pass the entire archive on to the intended
- party. This ensures that those who receive the program
- will have all the correct configuration utilities and
- documentation necessary to get OCRSHARE up and running
- quickly. A list of the files you should have, and the
- purpose of each, appears later in this manual.
-
- 4. The OCRSHARE documentation may not be modified by
- users. The program may not be separated from the
- documentation when distributed. Printed or photocopied
- ("Xeroxed") copies of the OCRSHARE documentation (i.e.,
- this manual) may not be distributed or sold without the
- written permission of STI.
-
- 5. This license to use OCRSHARE does NOT include any right
- to distribute or sell OCRSHARE for commercial purposes,
- gain, compensation or profit. No entity or person other
- than Solution Technology, Inc. may accept payment or
- royalties for this program without an express written
- agreement with STI. Distribution terms are given below.
-
- Public Domain and Shareware Libraries
- -------------------------------------
-
- 1. Distributors of "public domain" or user-supported
- software libraries must obtain written permission to
- distribute copies of OCRSHARE. No one may use OCRSHARE
- as a promotion for any commercial venture or as an
- enticement for the user to pay for any program,
- product, or service unless they have received the
- express written permission of Solution Technology, Inc.
-
-
- 2. You must obtain written permission from Solution
- Technology to distribute OCRSHARE. Please use the vendor
- application supplied near the end of this user manual.
- If you do not receive a reply, write again: our silence
- does NOT constitute permission, and you may not
- distribute "pending" receipt of permission.
-
- 3. A maximum disk fee as set by Solution Technology in the
- above vendor contract must not be exceeded. OCRSHARE
- may not be included on any disk sold for more than this
- maximum. Major CD-ROM or optical disk libraries are
- exempt from this restriction, provided that they have
- STI's permission to distribute OCRSHARE.
-
- 4. Vendors may not modify or delete ANY files on the disk.
- Vendors may add a "GO" program, and/or a reasonable
- number of small text files designed to assist or
- provide a service to the user, but these added files
- must be easily identifiable and end-users must be
- allowed to delete the added files.
-
- 5. Vendors must make a reasonable effort to distribute
- only the most recent versions of OCRSHARE. All vendors
- who have requested and received written permission to
- distribute OCRSHARE will receive new MAJOR releases as
- they are issued.
-
- 6. All disk vendors must comply with any and all vendor
- guidelines and vendor requirements set forth by the
- Association of Shareware Professionals (ASP); for more
- information about ASP, contact its chairman, Jim Button,
- at Buttonware in Seattle. Violation of any ASP guideline
- or requirement automatically revokes permission to
- distribute OCRSHARE.
-
- Until formal requirements are adopted by the ASP, you
- must comply with the following guidelines:
-
- Vendors must make an attempt to educate users on the
- nature of Shareware. Catalogs, advertisements, order
- forms, and all disks sold should contain ASP-approved
- or recommended wording describing the nature of
- shareware, and should explicitly state that no part of
- disk sale revenues is paid to the programs' authors.
- When vendor catalogs or advertisements carry both
- Shareware and PD programs, the Shareware programs must
- be differentiated from the public domain programs in
- some way (in the description, with an asterisk, by
- listing the registration fee, etc.).
-
- Association of Shareware Professionals (ASP)
- --------------------------------------------
-
- OCRSHARE is a Shareware program conforming to standards as
- established by the Association of Shareware Professionals
- (ASP) located at 325 118th Ave. S.E., Suite 200, Bellevue,
- WA 98005.
-
-
- Registration
- ------------
- If you find OCRSHARE useful you are encouraged to register
- your copy. The base registration fee is $45. With
- registration you will receive the latest version of
- OCRSHARE, called ATXSHARE, which does not have the shareware
- nags and allows graphics translations between .PCX, .IMG and
- .TIF image files. Additionally, registration allows you to
- obtain a full set of pretrained Advantex OCR fonts for only
- $25. You will find a registration form at the end of this
- document which will aid you in registering your copy of
- OCRSHARE.
-
- Please keep in mind that we must have a registration form on
- file for you before we can offer product support.
-
- Introduction
- ============
-
- OCRSHARE is a complete shareware version of STI's Advantex
- OCR (Optical Character Recognition) package, which is
- designed to convert scanned image files into text and picture
- files for use by other programs in your computer. After
- using OCRSHARE to input a document, you can manipulate the
- extracted data using your favorite word processor, desktop
- publishing, or graphics programs.
-
- OCRSHARE is smart, fast and flexible but still easy to use.
- We're sure you will find OCRSHARE an indispensable tool to
- reduce those endless hours at the keyboard attempting to
- computerize volumes of printed material.
-
-
- Why OCR Software?
- -----------------
-
- When it comes to inputting existing documents and hard copy
- data into your computer, you basically have only two
- options: either enter the data "manually" by retyping it
- into your system, or have an Optical Character Recognition
- (OCR) system do it for you.
-
- The page image is first scanned into a bitmapped image file
- using the utilities provided with your scanner. Each image
- file represents a single page and contains many rows of
- black and white dots (called pixels) which come from
- detecting the light reflected from the characters, lines,
- logos, dirt and other ink patterns on the page. While these
- graphic images look like characters on your graphics display,
- they are still only collections of dots.
-
- OCRSHARE's major purpose is to learn to identify these ink
- patterns and output their ASCII equivalents into a text file.
- For example, a graphic "A" becomes an ASCII "A", and so on
- until each character has been translated or converted to
- text.
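
To make the idea of "dots in, ASCII out" concrete, here is a minimal Python sketch that treats a glyph as a small grid of black and white dots and picks the closest match from a tiny dictionary of known shapes. It is only an illustration and is not OCRSHARE's actual recognition method: the 5x5 grids, the `TEMPLATES` dictionary and the `pixel_difference` and `recognize` functions are all hypothetical names invented for this example.

```python
# Toy illustration only (not OCRSHARE's algorithm).
# Each glyph is a 5x5 grid: '#' is a black dot, '.' is a white dot.
TEMPLATES = {
    "A": [".###.",
          "#...#",
          "#####",
          "#...#",
          "#...#"],
    "L": ["#....",
          "#....",
          "#....",
          "#....",
          "#####"],
    "T": ["#####",
          "..#..",
          "..#..",
          "..#..",
          "..#.."],
}


def pixel_difference(glyph, template):
    """Count the dots that differ between two equally sized grids."""
    return sum(
        g != t
        for glyph_row, template_row in zip(glyph, template)
        for g, t in zip(glyph_row, template_row)
    )


def recognize(glyph):
    """Return the character whose template differs from the glyph in the fewest dots."""
    return min(TEMPLATES, key=lambda ch: pixel_difference(glyph, TEMPLATES[ch]))


# A scanned "A" with one speck of dirt in the top-left corner.
scanned = ["####.",
           "#...#",
           "#####",
           "#...#",
           "#...#"]
print(recognize(scanned))
```

The speck of dirt changes one dot, so the scanned grid is still far closer to the "A" template than to "L" or "T". A real OCR font holds many more shapes and uses more robust matching than a simple dot count.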
-
- OCRSHARE can read using its dictionaries of existing symbols
- (stored in .FTF files) or easily learn to recognize new
- symbols, company logos or characters in other accented
- alphabets such as German, Greek, Spanish, French and
- Russian. Thus graphic symbols and characters are translated
- into their letter equivalents. Note that this process is
- letter and word recognition, not language translation.
- OCRSHARE can even be taught to read hand printed letters if
- the penmanship is neat and consistent.
-
- Features
- --------
- Features vary from product to product as follows:
-
- OCRSHARE Features
- -----------------
- - Reads scanned images in TIF, PCX and IMG files
- - Outputs ASCII, WordStar or WordPerfect files
- - OCR Features
- - Trainable OCR Fonts
- - Handles Skewed Text
- - Reads Monospaced, Proportional and Typeset Text
- - Comes with Helvetica, Times Roman & Courier OCR Fonts
- - Adjustable Automatic Dirt/Spotting Filter
- - Adjustable Automatic Graphics/Line/Box Filter
- - Image Cleanup features
- - Erase Inside
- - Erase Outside
- - Rotate Page
- - No Direct Scanner Support
-
- ATXSHARE Features ($45 Registration Required)
- ---------------------------------------------
- - All of the OCRSHARE Features above
- - Latest Version
- - No Shareware Nags
- - Save/Cutout TIF, PCX, IMG Files (e.g., file conversion)
- - Optional OCR font library ($25)
-
- AdvanTex OCR ($395)
- -------------------
- - Direct Scanner Support for Canon, Microtek, HP ScanJet
- and ScanJet Plus, Panasonic RS505,506, Chinon 2000 and
- 3000, Ricoh, Mitsubishi full page Handscanner, and
- others.
- - Batch Mode Support
- - Multipage Support
- - Save/Cutout TIF, PCX, IMG Files (e.g., file conversion)
- - OCR Font Library included
- - Other extensions.
-
- Ver 2.2 Enhancements
- --------------------
-
- The following improvements, enhancements, features and
- capabilities were added to the AdvanTex Release 2.2 base
- program as an upgrade over earlier versions.
-
- - Added a line separation preprocessor (Edit Menu) for
- text which has lines joined vertically by a few
- touching letters.
- - Added a ligatured (joined character) error detector
- which properly marks untrained ligatures in the output
- text (with XX).
- - Added proper quotation mark (") processing
- - Added word break test processor (Edit Menu)
- - Improved context rules for capitalization and numbers
- - Improved recognition at 200 dpi
- - Improved recognition rules for all recognition
- - Improved and expanded tolerance for line skew
- - Improved word space processing.
- - Upgraded all Scanner Drivers
- - Upgraded all OCR Fonts
-
- How to Use This Manual
- ======================
- Most PC users, novice to expert, generally hate to read
- manuals. Right?! That's why we've made OCRSHARE as
- intuitive and easy to use as possible.
-
- We have included easy-to-understand lessons in our tutorial
- section. These lessons provide you with clear, step-by-step
- instructions on using all the simple, yet powerful features
- of OCRSHARE. Even if you are an expert computer user, you
- should at least glance through the lessons to get acquainted
- with OCRSHARE. Once you understand the basics, you can then
- use the manual's index to quickly find out how to make
- OCRSHARE do what you want it to do.
-
- Lessons
- -------
- You can learn even more about OCRSHARE by visiting the
- Tutorial section and going through the step-by-step examples
- in the Lessons.
-
- Trouble Shooting
- ----------------
- Help is always available while running OCRSHARE. Simply
- press "Alt" and "H" together and a Help menu will appear on
- your screen.
-
- Installation
- ============
-
- Computer Requirements
- ---------------------
-
- Your computer should have at least the following
- capabilities.
-
- - An 8088, 8086, 80286 or 80386 based IBM/PC or
- compatible computer. The faster the CPU chip the
- better. An 8088 at 4.77 MHz will work, but will be
- unbearably slow.
-
- - DOS 3.0 or higher operating system.
-
- - A full 640 K of main memory.
-
- - A hard disk drive with at least 2-3 megabytes of
- free space.
-
- - Display card/monitor with graphics capability
- (Hercules monochrome, CGA, EGA, VGA, etc.)
-
-
- Running from Hard Disk or RAM Disk
- ==================================
-
-
- OCRSHARE can be installed in virtually any directory on any
- hard disk drive of your system, or even copied to and run
- from a RAM disk (most effective when running the disk EMS
- simulator).
-
- Notation
- --------
-
- - When you see [sp] the space indicated is mandatory so
- press the space bar on the keyboard.
-
- - When you see [enter] press the key marked ENTER on your
- keyboard.
-
- - When you see a letter preceded by the word CTRL (as in
- [CTRL-C]), hold down the key marked "CTRL" or "Control"
- on your keyboard and simultaneously press the letter
- key indicated (in this case 'C').
-
- Hard Disk Installation
- ----------------------
-
- A summary of the installation steps is listed below.
- Installing OCRSHARE is quite simple and mostly involves
- copying files.
-
-
- 1. Create a subdirectory called OCRSHARE for OCRSHARE on
- your hard disk. You may use another directory name,
- although the examples assume OCRSHARE is used.
-
- 2. Use PKUNZIP to expand the files of OCRSHRxx.ZIP into
- the directory.
-
- 3. Modify CONFIG.SYS if necessary.
-
- 4. Log into the OCRSHARE directory and run the program by
- typing
-
- OCRSHARE[enter]
-
-
- Modifying CONFIG.SYS
- ====================
-
- You should have a CONFIG.SYS file in your root directory.
- CONFIG.SYS is a file which is read by the computer during
- the power up boot and CTRL-ALT-DEL three-key boot sequence.
- CONFIG.SYS, which must be on the boot disk, contains
- configuration information for DOS as well as specifications
- for the loadable device drivers.
-
- While your CONFIG.SYS may contain more statements, the
- statements used by OCRSHARE are:
-
-
- FILES = <number of simultaneous files permitted>
- BUFFERS = <number of disk buffers available>
- DEVICE = <EMS memory card driver filename>
-
-
-
- FILES & BUFFERS
- ---------------
-
- We recommend the following minimum values for the FILES and
- BUFFERS statements in CONFIG.SYS. More than the minimum is
- perfectly OK.
-
- FILES = 20
- BUFFERS = 20
-
- Note: If there are multiple FILES and/or BUFFERS lines in
- your CONFIG.SYS (usually put there by different application
- installation programs), keep only the line with the largest
- value for each statement and delete all other lines.
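
The clean-up described in the note can be sketched in a few lines of Python. This is only an illustration of the rule (keep the largest FILES and BUFFERS values, never below the recommended minimum of 20), not a tool supplied with OCRSHARE. The `tidy_config` function and the `DEVICE=EMM.SYS` driver name in the sample are hypothetical, and the sketch assumes plain numeric values such as `FILES=30`.

```python
import re

# Minimum values recommended in this manual.
MINIMUMS = {"FILES": 20, "BUFFERS": 20}


def tidy_config(text):
    """Return CONFIG.SYS text with a single FILES line and a single BUFFERS line.

    Each keeps the largest value found, raised to the recommended minimum.
    All other lines are passed through unchanged, after the two settings.
    """
    pattern = re.compile(r"^\s*(FILES|BUFFERS)\s*=\s*(\d+)", re.IGNORECASE)
    best = dict(MINIMUMS)
    other_lines = []
    for line in text.splitlines():
        match = pattern.match(line)
        if match:
            key = match.group(1).upper()
            best[key] = max(best[key], int(match.group(2)))
        else:
            other_lines.append(line)
    settings = [f"{key}={value}" for key, value in best.items()]
    return "\n".join(settings + other_lines)


# Hypothetical CONFIG.SYS with a duplicate FILES line and a low BUFFERS value.
sample = "FILES=8\nBUFFERS=15\nDEVICE=EMM.SYS\nFILES=30\n"
print(tidy_config(sample))
```

For the sample above, the duplicate FILES lines collapse to the larger value (30) and BUFFERS is raised from 15 to the recommended 20. Always keep a backup copy of CONFIG.SYS before changing it.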
-
- Starting OCRSHARE
- -----------------
-
- To start OCRSHARE type
-
- OCRSHARE[enter]
-
- OCRSHARE will normally automatically select the proper
- internal graphics display driver and display mode for your
- graphics card.
-
-
- Note: Color monitors are supported in black and white (2
- color) modes only since color modes are too memory intensive
- for large images.
-
- Forcing a Display Adaptor
- -------------------------
-
- If you have dual screens, an unknown brand display card, or
- desire a different display mode, you may have to override
- OCRSHARE's auto selection. To do this, simply add the
- appropriate switch parameter after OCRSHARE on the command
- line when you start OCRSHARE.
-
- For example, the following forces OCRSHARE to use the 2
- color 640x350 EGA driver.
-
- OCRSHARE /E:6
-
- Upon entering this command the following message will be
- displayed:
-
- IBM Enhanced Graphics Adapter, 640 x 350 2-color
- to quit -press CTRL-C
- to continue -press any other key
-
- If you signal continue, OCRSHARE will set up for the card
- you specified. The hardware configuration is saved in the
- file OCRSHARE.CFG so that the next time you start OCRSHARE
- your display adaptor is automatically selected.
-
- If OCRSHARE seems to hang on start up, you may still have an
- incompatible display card. Press CTRL-X to get back to the
- prompt or reboot and append the correct option switch for
- your specific display adaptor.
-
- OCRSHARE /E:6 [enter] if you have an EGA compatible card
- OCRSHARE /C:1 [enter] if you have a CGA compatible card
- OCRSHARE /E:7 [enter] if you have a VGA compatible card
- or
- OCRSHARE /xxx [enter] (where xxx is one of the settings below)
-
- Screen Selections
- -----------------
-
- The screen selections supported at the time of the printing
- of this manual are listed below.
-
- Option  Graphics Card                               Screen Size
-
- /A      AT&T Graphics Adapter                       640x400
- /A:1    AT&T Graphics Adapter                       640x400
- /C:1    IBM Color Graphics Adapter (CGA)            640x200
- /E:1    IBM Enhanced Graphics Adapter (EGA)         640x350
- /E:6    IBM Enhanced Graphics Adapter (EGA)         640x350
- /E:7    IBM PS/2 MultiColor Graphics Array (MCGA)   640x480
- /G      MDS Genius Display Adapter                  736x1008
- /H      Hercules/AST Monochrome                     720x348
- /O      Toshiba 3100                                640x400
- /T      Tecmar Graphics Adapter                     720x352
- /T:1    Tecmar Graphics Adapter                     720x352
- /T:2    Tecmar Graphics Adapter                     720x704
- /S      STB GraphicsPlus-II Adapter                 640x352
- /S:1    STB GraphicsPlus-II Adapter                 640x352
- /S:5    STB GraphicsPlus-III Adapter                640x400
- /W      Wyse WY-700                                 640x400
- /W:1    Wyse WY-700                                 640x400
- /W:2    Wyse WY-700                                 800x400
- /W:3    Wyse WY-700                                 1280x800
- /X      IBM 3270 PC                                 720x350
- /X:1    IBM 3270 PC                                 720x350
-
- For Compaq Monochrome Graphics Monitors, use the following
- settings:
-
- /A:1    Portable II and III plasma                  640x400
- /C      Portable or CGA compatible                  640x200
- /E:1    EGA with monochrome screen                  640x350
-
- Note: VGA will always default to MCGA (/E:7)
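
If you keep notes on which switch suits which machine, the table lends itself to a simple lookup. The Python sketch below copies a few rows from the table above and builds the matching command line. It is an illustration only: the `DISPLAY_SWITCHES` dictionary and the `command_line` and `describe` functions are hypothetical helper names, and the sketch simply upper-cases what you type so that it matches the form printed in this manual.

```python
# Switch -> (adapter name, width, height), copied from a few rows of the table.
DISPLAY_SWITCHES = {
    "/C:1": ("IBM Color Graphics Adapter (CGA)", 640, 200),
    "/E:6": ("IBM Enhanced Graphics Adapter (EGA)", 640, 350),
    "/E:7": ("IBM PS/2 MultiColor Graphics Array (MCGA)", 640, 480),
    "/H": ("Hercules/AST Monochrome", 720, 348),
}


def command_line(switch):
    """Build the DOS command that forces the given display switch."""
    key = switch.upper()
    if key not in DISPLAY_SWITCHES:
        raise ValueError(f"unknown display switch: {switch}")
    return f"OCRSHARE {key}"


def describe(switch):
    """Return a short description of the adapter and screen size for a switch."""
    adapter, width, height = DISPLAY_SWITCHES[switch.upper()]
    return f"{adapter}, {width} x {height}"


print(command_line("/e:6"))
print(describe("/e:6"))
```

Rejecting switches that are not in the table catches typing mistakes before they reach the DOS prompt.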
-
- Mice
- ----
- Although OCRSHARE does not at this time support a mouse
- directly, it will work if you have a programmable mouse
- hooked up to your computer. Such a mouse can emulate the
- directional keys of the keyboard (left, right arrow, etc.).
-
- Networks
- --------
-
- OCRSHARE is not generally network aware but can be used from
- a server. If your network drivers occupy too much room in
- the first 640K, OCRSHARE will not have sufficient room to
- run. Also be aware of OCRSHARE's use of EMS memory, with
- particular attention to server disk caching. To install,
- simply copy all of the programs and files to a server
- directory.
-
- EMS Memory
- ----------
-
- EMS memory provides up to 8 megabytes of additional memory
- to any IBM/PC compatible computer. This memory is not
- directly addressable by the computer; it must be paged in
- and out to a 64 Kilobyte buffer by software calls to an EMM
- device driver. This driver, installed in CONFIG.SYS, must
- conform to the Lotus, Intel, Microsoft (LIM) specification
- 3.3 or 4.0.
-
- OCRSHARE and EMS memory
- -----------------------
-
-
- OCRSHARE uses the EMM driver to access the EMS memory.
- OCRSHARE stores the full sized display image and the zoomed
- out image in the EMS memory. While you do not need external
- EMS memory for occasional work, you will find the
- performance advantages of a real EMS card significant if you
- are trying to process large numbers of pages.
-
- EMS Memory Requirements
- -----------------------
-
- OCRSHARE images require real or simulated EMS memory.
- The chart below gives you some idea of how much memory you
- will need for different sized images.
-
- Image Resolution 8.5x11 in
-
- 200 Dpi about 500K
- 300 Dpi about 1.1 Meg
- 400 Dpi about 1.8 Meg
-
- The formula for calculating the number of bytes required is:
- (Resolution x Resolution x length x width) / 8 (bits per
- byte).
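
As a quick check of the formula, the short Python sketch below works it through for a letter-size page at the three resolutions in the chart. It assumes one bit per dot (black and white images) and a page of 8.5 x 11 inches; the `image_bytes` function name is just a label for this example.

```python
def image_bytes(dpi, width_in=8.5, length_in=11.0):
    """Bytes needed for a 1-bit-per-dot page: (dpi x dpi x length x width) / 8."""
    return dpi * dpi * length_in * width_in / 8


for dpi in (200, 300, 400):
    size = image_bytes(dpi)
    print(f"{dpi} dpi: {size:,.0f} bytes (about {size / 1024:,.0f}K)")
```

The results are 467,500 bytes at 200 dpi, 1,051,875 bytes at 300 dpi and 1,870,000 bytes at 400 dpi, which agree with the rounded figures in the chart.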
-
- Note: If you have insufficient real EMS memory to store the
- size of the image you are scanning, you will get an EMS
- allocation error. You can still process the image by setting
- OCRSHARE to simulate the EMS environment on disk, then
- exiting and restarting OCRSHARE.
-
- Types of EMS Memory
- -------------------
-
- Depending on whether you are using a real EMS card, an
- extended memory EMS simulator, or a disk EMS simulator, the
- memory used by the EMS system will, respectively, be paged
- in and out to a real EMS card, or to extended memory, or to
- your hard disk. OCRSHARE's internal EMS simulator is a
- highly optimized form of a disk EMS simulator.
-
- OCRSHARE's EMS simulator
- ----------------------------
-
- If you do not have real EMS memory active, OCRSHARE will use
- the free space on your hard disk as temporary virtual memory
- and simulate the EMS function internally.
-
- If, for example, you are running OCRSHARE from drive C:
- which has, let's say, 400K of hard disk space remaining (use
- CHKDSK to determine), then there will not be enough space to
- run OCRSHARE. If you have another hard drive (e.g., D:) with
- more space in it, you can log into that drive (D:), and run
- OCRSHARE as long as OCRSHARE.EXE is in the path. Although
- OCRSHARE is installed on C:, because it is called from
- another drive (D:), the remaining space on that drive (D:)
- will be used as the simulator.
-
- Note: Make sure your hard disk has 1-2 megabytes of free
- space on it. You can use DOS's CHKDSK utility to see how
- much disk space is available. The OCRSHARE simulator is in
- effect while OCRSHARE is running. When OCRSHARE exits, all
- of OCRSHARE's simulator files are deleted.
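
On a modern system the same "is there enough room on this drive?" check can be sketched in Python. This is an illustration only and has nothing to do with how OCRSHARE itself measures disk space. The `can_simulate` and `free_space` names are hypothetical, and the 2 megabyte default is taken from the upper end of the 1-2 megabyte guideline in the note above.

```python
import shutil

# Upper end of the 1-2 megabyte guideline given in the note above.
RECOMMENDED_FREE_BYTES = 2 * 1024 * 1024


def can_simulate(free_bytes, required_bytes=RECOMMENDED_FREE_BYTES):
    """Return True if the drive has enough free space for the disk EMS simulator."""
    return free_bytes >= required_bytes


def free_space(path="."):
    """Free bytes on the drive holding `path` (the drive you are logged into)."""
    return shutil.disk_usage(path).free


# The example from the text: 400K remaining on drive C: is not enough.
print(can_simulate(400 * 1024))
print(can_simulate(free_space()))
```

Checking the current directory mirrors the behaviour described above, where the drive you are logged into, not the drive OCRSHARE is installed on, supplies the simulator space.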
-
- Installing a disk cache in EMS Memory
- -------------------------------------
-
- SOME DELAYED WRITE DISK CACHE PROGRAMS, WHEN INSTALLED TO
- USE EMS MEMORY, DO NOT SHARE THE EMS RESOURCE PROPERLY WITH
- OTHER PROGRAMS AND CAN CRASH YOUR HARD DISK.
-
- If you try to run ANY EMS application with an "ill-behaved"
- delayed write cache program installed in EMS, it will likely
- result in a serious disk crash with corrupted FAT tables.
- "Well-behaved" EMS disk caching programs will properly save
- and restore EMS information as they operate.
-
- While cache programs usually operate properly when set to
- WRITE THROUGH mode (where all writes are done immediately),
- the problem occurs when the cache program is set to do
- DELAYED WRITES to the disk. This requires that the cache use
- a clock tick to interrupt the application program to do the
- disk writes. If the cache program DOES NOT properly save
- and restore EMS page information during this interrupt, it
- WILL cause a disk crash.
-
- This warning does not apply to WRITE THROUGH cache programs
- or VDISK programs, which are generally well behaved. If you
- want to use DELAYED WRITES with your disk cache program,
- install the cache memory buffer in Extended memory. Besides
- using the unused 384K that is typically there, this leaves
- the EMS Expanded Memory free for OCRSHARE to use.
-
- Scanner Installation
- --------------------
-
- OCRSHARE and ATXSHARE are shareware versions of Solution
- Technology's Advantex OCR product, which directly supports
- most desktop scanners. Because the installation of scanners
- is often complicated (and costly) in terms of telephone
- support, these shareware products are not distributed with
- scanner drivers or scanning capability. To get direct
- scanner support you must purchase regular AdvanTex. (See the
- file README.1ST).
-
- OCR on prescanned Images
- ------------------------
- You must scan images using your scanner's built-in utility
- and save them in a TIFF, PCX or IMG image file format.
- OCRSHARE can then load these images and perform OCR or basic
- image manipulation.
-
- Tutorial
- ========
-
- With OCRSHARE, you are only a few keystrokes from obtaining
- usable text output in the form of an ASCII text file from
- pages scanned and saved in PCX, TIF or IMG image file
- formats.
-
- This Tutorial is designed to get you up and running with
- OCRSHARE in the shortest possible time. If you feel that you
- need more detailed instructions, or you are going to use
- OCRSHARE almost immediately to do production work, simply
- begin with the regular Tutorial section.
-
- Starting OCRSHARE
- =================
-
- 1. Start OCRSHARE by typing OCRSHARE at the DOS prompt,
- then press [enter].
-
-
- Note: As long as the directory containing OCRSHARE.EXE is
- listed in the PATH command, OCRSHARE can be called from any
- drive or sub-directory. Please refer to your DOS manual for
- more information about the use of PATH.
-
- Moving Around
- =============
-
- To move the selector bar, use the up, down, left, and right
- arrow keys on your numeric/arrow key pad. The [PgUp] and
- [Home] keys will move the cursor to the top of a menu, and
- the [PgDn] and [End] keys will move the cursor to the bottom
- of a menu.
-
- The Six Basic Keyboard Keys
- ---------------------------
- You normally only need to use the following six keys to
- execute virtually any function in OCRSHARE.
-
- Arrow Keys - The up and down arrow keys move the menu
- cursor bar up and down any menu. The left and right
- arrow keys move the cursor bar from menu to menu.
-
- Escape Key - The Esc (escape) key acts as both your
- dismiss menu (or cancel) key, and as your attention
- key. When you press the Esc key, OCRSHARE will stop what
- it is doing and return to the previous menu screen.
-
- [enter] Key - The [enter] key is used to select, execute or
- activate the action highlighted by the cursor. Once you
- have placed the cursor on the action of choice, press
- [enter] to activate that action. The [enter] key may be
- identified as RETURN or as an arrow on your keyboard.
-
- Short Cut Keys
- --------------
- The following keys allow you to bypass moving around the
- menus.
-
-
- Home Key - This key moves you to the top of any menu or
- list.
-
- End Key - This key moves you to the bottom of any menu
- or list.
-
- PgUp/PgDn - These keys move you up or down one list page on
- a scrollable list. If the list is not scrollable these
- keys act the same as the Home/End keys.
-
- CTRL Keys - A control key shortcut is indicated on the
- menus by a ^ symbol followed by a letter. To execute
- the function associated with a control key, hold down
- the CTRL key and press the letter indicated on the
- menu.
-
- Function Keys - A function key shortcut is indicated on the
- menus by a F and followed by a number. To execute the
- function associated with an F key, simply press the
- indicated function key on your keyboard.
-
- CTRL PgDn - When drawing a capture box with Select Area F9,
- this combination will move the cursor box down the page
- in much larger steps than the down arrow will.
-
- CTRL PgUp - When drawing a capture box with Select Area F9,
- this combination will move the cursor box up the page
- in much larger steps than the up arrow will.
-
- CTRL -> - When drawing a capture box with Select Area F9,
- this combination will move the cursor box to the right
- in much larger steps than the right arrow will.
-
- CTRL <- - When drawing a capture box with Select Area F9,
- this combination will move the cursor box to the left
- in much larger steps than the left arrow will.
-
- Exiting OCRSHARE
- ================
-
- When it is time to exit OCRSHARE, use the arrow keys to
- position the cursor over the Exit to DOS command under the
- INFO main menu. Press [enter]. You may also exit OCRSHARE by
- pressing the CTRL-X control key shortcut.
-
- Perform OCR on an Image File
- ============================
-
- Load a scanned image file
- -------------------------
- 1. Press F5 to select the image file loader
- 2. Press [enter] to select the OCRSHARE file format.
- 3. Press the Down Arrow Key to move the selector bar over
- the PAGE1.ATX file name
- 4. Press [enter] Once to select that file to load.
- 5. Press [enter] Again to load the selected file.
-
- Exploring the loaded image
- --------------------------
- 1. Load an image file as described above
- 2. Select "Select Area" or press F9. A cross hair cursor
- will appear.
- 3. Use the Arrow Keys to move the cross hair cursor to
- interesting parts of the scanned image.
- 4. Press F10 to "Zoom In" on selected area
- 5. Press F10 again to "Zoom Out" to the full page view.
- 6. Press [esc] key to cancel the select area.
-
- Load an OCR Recognition Font
- ----------------------------
- 1. Select "Font Settings..." on the OCR menu, or press
- CTRL-F.
- 2. Select "Load Existing Font".
- 3. Move the cursor bar so that it is over TMSROMAN.FTF in
- the fonts list.
- 4. Press [enter] once to select the font.
- 5. Press [enter] a second time to load the font.
- 6. Press Esc to get back to the OCR main menu.
-
- Setup the Text Output file
- --------------------------
- 1. Select "Text Settings..." on the OCR menu, or press
- CTRL-T.
- 2. Select "Set Output Text File".
- 3. Type DEMO at the File: prompt (the .TXT extension will
- be appended for you); press [enter].
- 4. Press [Esc] to get back to the OCR main menu.
-
- OCR the full page
- -----------------
- 1. Load one or more OCR Fonts.
- 2. Load a scanned page image file (PAGE1.ATX for
- tutorial).
- 3. Setup text output file name.
- 4. Select "Convert to Text" or press F4 to perform OCR on
- the full page and put the results into a newly created
- file called DEMO.TXT. This output text file can be sent
- to the printer, viewed from DOS, or imported into your
- favorite word processor.
-
- Selecting areas to OCR or Cutout
- --------------------------------
- 1. Press F9 (Select area)
- 2. Move the cross hairs with the arrow keys.
- 3. Anchor the first corner by pressing the [enter] key.
- 4. Pull the tiny icon box (dragon) using the arrow keys
- until it surrounds the area desired.
- 5. Use the "+" and "-" keys to move the dragon to
- different corners of the larger capture box.
- 6. When you are satisfied with the positions of all
- corners, lock the capture box in place by pressing the
- [enter] key.
-
- Canceling a selected area
- -------------------------
- 1. Press F9 (Select area)
- 2. Press [enter] twice.
- 3. Since the select area rectangle has no width or height
- it is now canceled and full page is implied.
-
- OCR of selected area
- --------------------
- 1. Mark an area of interest as described above.
- 2. Press F4 to perform OCR on the selected area and
- either append to or overwrite the existing file called
- DEMO.TXT. This output text file can be sent to the
- printer, viewed from DOS, or imported into your
- favorite word processor.
-
- Unloading an OCR Font
- ---------------------
- 1. Select "Font Settings..." on the OCR menu or press
- the CTRL-F key.
- 2. Use Down Arrow to move the selector to TMSROMAN.FTF.
- 3. Press [enter] to pop up the font management menu.
- 4. Press [enter] to remove TMSROMAN.FTF
-
- Saving a selected area
- ----------------------
- 1. Mark an area of interest as described above.
- 2. Select "Save Area" or press F8.
- 3. Select a graphics format to save to. (OCRSHARE is
- OCRSHARE's fast load internal graphics file format.
- You must register to obtain the other formats
- listed).
- 4. Type a file name, say CUTOUT, at the File: prompt; the
- appropriate extension will be appended for you.
- 5. Press the [enter] key to use this file name.
- 6. The selected area of the image is now saved into the
- file CUTOUT.ATX, in the graphics file format you
- selected.
-
- Training a New OCR Font
- -----------------------
- 1. If you have not loaded PAGE1.ATX do so now as described
- above.
- 2. Use Select Area (F9) to draw a box around the letters.
- Make sure that you have some upper case or tall letters
- on each text line in the selected area.
- 3. Select "Font Settings..." or press CTRL-F.
- 4. If TMSROMAN.FTF is still loaded, unload it as described
- above.
- 5. Select "Make a New Font" and type in OCRFONT (or any
- other name you want).
- 6. Press the [enter] key to register that font for
- training.
- 7. Press [Esc] to get back to the OCR main menu.
- 8. Select "Train Font" or press F7 to begin training
- characters from the selected area into the newly
- created OCRFONT.FTF.
-
- a) As each letter is displayed and boxed, type the
- corresponding letter on the keyboard followed by
- [enter] key.
- b) If multiple letters are boxed, you can type them
- all in sequence, and press the [enter] key.
- c) If the symbol boxed has no equivalent keyboard
- letter, you should make up an unambiguous two or
- three letter sequence to name the letter, and
- press the [enter] key. Note that the next time
- that symbol appears for training you MUST use the
- SAME name you entered earlier.
- d) To skip over dirt, spots or badly broken letters,
- press [enter] without typing the name of a symbol.
- Only named symbols are added to the OCR font data
- base.
- e) Press the [Esc] key any time to stop training and
- get back to the OCR menu.
-
- Saving the New OCR Font
- -----------------------
- 1. When you are finished training the new font, select
- "Font Settings..." or press CTRL-F from the OCR menu.
- 2. Select OCRFONT.FTF (i.e., the OCR font you have just
- trained).
- 3. Move the selector bar to "Save Font" and press
- [enter]. You have just created a partially trained
- font: it covers every letter except those OCRSHARE
- hasn't seen yet.
-
- More about OCRSHARE's File Finders
- ==================================
-
- OCRSHARE uses a menu window called a File Finder to select
- files for loading and to name files for saving. A File
- Finder displays a list of the available files of a given
- type in the currently specified directory. Explore this with
- the following exercise:
-
- 1. The cursor bar is initially located on the File:
- prompt bar. Press the up arrow key until the cursor
- bar is on the PATH: *.ATX prompt line.
- 2. Press the down arrow key until the cursor bar is in the
- box listing the OCRSHARE image file names.
- 3. Use the down arrow key again to move the cursor bar
- down until it covers the file named PAGE1.ATX.
- 4. Press the [enter] key.
-
-
- Note that the file name and the cursor bar pop up to the
- File: prompt line. Press the [enter] key again to load the
- PAGE1.ATX file into display memory. While loading any file,
- OCRSHARE first displays a load progress window, then a
- Computing Full Page View progress window as it constructs
- the full page view of the image. When loading has finished,
- OCRSHARE will display the full page view on your monitor.
-
-
- 5. Press the ESC key to hide the OCR drop down menu as
- illustrated above. This lets you view the page without
- anything in the way.
-
- 6. Press the ESC key again and the OCR main menu will
- reappear.
-
- Editing the File Name
- ---------------------
- All OCRSHARE text fields have an easy-to-use, built-in line
- text editor. Notice the small vertical bar to the right of
- PAGE1.ATX. This means that the field editor is in insert
- mode.
-
- 1. Press the [Ins] key several times to toggle between
- insert and overstrike mode.
-
- In insert mode the vertical bar cursor indicates the point
- between letters where the next character will go. In
- overstrike mode the block cursor covers the character which
- will be overwritten. Either cursor indicates that you can
- edit the characters of the file name in this field.
-
- 2. Press the Home key to move to the beginning of the text
- field.
- 3. Press the End key to move to the end of the text field.
- 4. Press Left arrow key to move the text cursor (vertical
- bar) to the point between the 1 and the period. This
- is the position where we are going to begin editing.
- 5. Press the Backspace key to delete the 1. (You could
- have also moved to the point between the E and the 1
- and pressed the Del key).
- 6. Type the number 1 to reinsert the number 1.
- 7. Type the letter a to change PAGE1.ATX to PAGE1a.ATX
- 8. Press the Backspace key to delete the a.
-
- Over typing the File Name
- -------------------------
- You can also type over the existing file name.
-
- 1. On any finder menu, move the selector bar to the File:
- field.
- 2. Press the Home key. The text cursor will move to the
- far left side of the file name.
- 3. Press the [Ins] (insert) key until the text cursor
- changes to type-over mode. OCRSHARE is ready to type
- over the existing character or the entire file name.
- 4. Type in the filename's letters and press the [Del] key
- to delete any trailing letters.
- 5. Press [enter] to use this filename and the path
- specified by the Path: field.
-
- Incrementing Numbered File Names
- --------------------------------
- OCRSHARE is designed to work easily with filenames that end
- in numbers. PAGE1.ATX ends with the number 1, which makes
- it a numbered file. Note that the file extension (in this
- case .ATX) indicates the type of file information rather
- than just serving as the end of the filename.
-
- 1. Press the F2 function key on your keyboard. The file
- name will change to PAGE2.ATX.
- 2. Press the F10 function key and the File name changes
- back to PAGE1.ATX. You may increment or decrement the
- file name in this manner for all numbers between 0
- (zero) and 9999.
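
The stepping behaviour described above can be illustrated with a short Python sketch. This is not OCRSHARE's code; the function name step_numbered_name is hypothetical, and it assumes that the number sits at the very end of the base name (just before the extension) and that a step outside the documented 0 to 9999 range simply leaves the name unchanged. How OCRSHARE treats leading zeros is not stated in this manual, so the sketch drops them.

```python
import re


def step_numbered_name(filename, delta):
    """Return filename with its trailing number changed by delta.

    Hypothetical helper: assumes the number ends the base name, just
    before the optional extension, and stays within 0..9999.
    """
    match = re.fullmatch(r"(.*?)(\d+)(\.[^.]*)?", filename)
    if match is None:
        return filename  # not a numbered file name; leave it unchanged
    stem, digits, ext = match.group(1), match.group(2), match.group(3) or ""
    number = int(digits) + delta
    if not 0 <= number <= 9999:
        return filename  # outside the documented range; leave it unchanged
    return f"{stem}{number}{ext}"


if __name__ == "__main__":
    print(step_numbered_name("PAGE1.ATX", +1))
    print(step_numbered_name("PAGE2.ATX", -1))
```

Calling the helper with +1 and then -1 mirrors pressing the increment and decrement keys in the File Finder.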
-
- Notes:
-
- 1. Each type of finder has a separate data storage area
- for its current information. This means that you can
- set the path for the OCRSHARE finder to one directory
- and the path of the PCX finder to another directory and
- OCRSHARE will keep them straight.
-
- Graphics Editing
- ================
-
- The following graphics menu performs editing functions that
- modify pixels, areas or the entire page in the display
- memory. This enables you to modify scanned images to improve
- the quality of graphics and OCR projects. OCRSHARE can load
- and save images to or from PCX, TIFF, or (our own) OCRSHARE
- file formats thus allowing you to convert image files for
- different purposes.
-
- Invert Page
- -----------
- Invert Page makes negatives of positive (black on
- white) images.
-
- 1. If you don't have the image PAGE1.ATX still
- loaded, load it again.
- 2. Select Invert Page by moving the cursor bar over Invert
- Page and pressing [enter].
- 3. The black and white image reverses itself.
- 4. Press [enter] again and the image reverses, bringing it
- back to its original state. It is that easy.
-
- Flip Page Vertical
- ------------------
- Flip page vertical rights pages which were accidentally
- scanned upside down.
-
- 1. Select Flip Page Vertical and the image will turn
- upside down so that the top is now the bottom.
- 2. Press [enter] again and the image returns to its
- original upright position.
-
- Erase Inside
- ------------
- Erase Inside erases the area INSIDE a "Select Area"
- rectangle and is good for selectively eliminating unwanted
- parts of the scanned page. The area selected can be as small
- as one bit or as large as the entire page.
-
- 1. Use Select Area to enclose an area inside the monitor
- screen in the image as shown above.
- 2. Select Erase Inside. OCRSHARE will erase everything
- within the box. The monitor in the image now appears
- clear with a black border around it as shown above.
-
- Erase Outside
- -------------
- Erase Outside erases the area OUTSIDE a "Select Area"
- rectangle and is perfect for eliminating edge trash. The
- area selected can be as small as one bit or as large as the
- entire page.
-
- 1. Use Select Area to enclose an area inside the monitor
- screen in the image as shown above.
- 2. Select Erase Outside. OCRSHARE will erase everything
- BUT what is inside the select area box. Only the
- contents of the box remain; the rest of the page is
- cleared.
-
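
The four page operations above can be sketched on a tiny bitmap in Python. This is an illustration only, not OCRSHARE's implementation. It assumes a page is a list of rows, that 1 means black ink and 0 means white paper, that "erase" means setting pixels to white, and that the select area is given as inclusive top, left, bottom, right pixel coordinates. All function names are hypothetical.

```python
def invert_page(page):
    """Swap black and white pixels (negative of the image)."""
    return [[1 - pixel for pixel in row] for row in page]


def flip_page_vertical(page):
    """Reverse the row order so the top row becomes the bottom row."""
    return [row[:] for row in reversed(page)]


def _inside(y, x, top, left, bottom, right):
    """True if pixel (y, x) lies within the inclusive select area."""
    return top <= y <= bottom and left <= x <= right


def erase_inside(page, top, left, bottom, right):
    """Clear every pixel inside the select area."""
    return [[0 if _inside(y, x, top, left, bottom, right) else pixel
             for x, pixel in enumerate(row)]
            for y, row in enumerate(page)]


def erase_outside(page, top, left, bottom, right):
    """Clear every pixel outside the select area."""
    return [[pixel if _inside(y, x, top, left, bottom, right) else 0
             for x, pixel in enumerate(row)]
            for y, row in enumerate(page)]
```

Each function returns a new page and leaves its input alone, which matches the manual's point that the file on disk is not changed until you save over it.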
- Notes:
-
- 1. Your original image, PAGE1.ATX, has not been modified,
- and will not be until you elect to save over it.
-
- OCR Notes
- =========
-
- 1. Due to memory constraints, we provide direct, internal
- support for only a few word processor formats at this
- time. If you are going into a different word processor,
- you should select the Wordstar document format, and use
- the import utility provided with your word processor to
- convert the ASCII text file into your word processor's
- internal text format.
-
- 2. OCRSHARE automatically avoids small spots of dirt as
- well as pictures in PAGE1.ATX.
-
- 3. You may occasionally see extra spaces or symbols
- seemingly run together in the Progress Monitor window.
- In addition you may even see two or three lines as
- OCRSHARE wraps the text output on the display. Don't
- fret, the monitor window is only an early preview of
- how conversion is doing. The actual page formatting is
- done later, after OCRSHARE can analyze the relationship
- of ALL of the symbols from the page. Therefore, the
- actual text (.TXT) file will show the most accurate
- translation.
-
- Looking at the output text file
- -------------------------------
- You must leave OCRSHARE to examine your output text file
- PAGE1.TXT.
-
- 1. Press CTRL-X or select Exit to DOS on the INFO main menu.
- 2. At the DOS prompt type in the following command:
-
- TYPE PAGE1.TXT [enter]
-
- Additional Information on Training
- ==================================
-
- Naming the ink blots during training
- ------------------------------------
- Font training is the process by which the user teaches
- OCRSHARE to recognize new characters. In its simplest form,
- training is a two-step process.
-
- 1. OCRSHARE automatically locates and draws a box around
- an unknown ink pattern.
-
- 2. You type in one or more ASCII character(s) to name the
- ink pattern. OCRSHARE will remember your choice, and
- during OCR translation it will output these ASCII
- characters when it detects another ink pattern closely
- approximating the one trained.
-
- Proper training and symbol diversity
- ------------------------------------
- One of the major components of properly training a new OCR
- Font is to select an area of symbols from a scanned page of
- text that contains the font desired and a good mixture of
- upper case letters, lower case letters, numbers and
- punctuation.
-
- Making an Alphabet Page
- -----------------------
- We have found that the following letter sequence, taken from
- a standard keyboard, does quite a nice job of training up a
- character set. Not perfect, mind you, but easy to implement
- and use. Note that there are extra spaces between each
- symbol. Note also that the punctuation MUST NOT be put on
- separate lines; otherwise its size and position relative to
- the rest of the letters in the font will be lost.
-
- ` 1 2 3 4 5 6 7 8 9 0 - = \ ~ ! @ # $ % ^ & * ( ) +
- A a B b C c D d E e F f G g H h I i J j K k
- L l M m N n O o P p Q q R r S s T t U u V v
- W w X x Y y Z z , . < > / ; ' : " [ ] { }
-
- Since we have found that it is usually sufficient to train
- each symbol 3 to 8 times, you should, for convenience, put a
- number of sets of the above sequence on the same page. In
- addition, you should generate a test paragraph (Quick brown
- fox paragraph) which uses all of the letters and punctuation
- in normal words and sentences.
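
As a convenience, an Alphabet Page like the one described above could be generated with a short Python script and then printed from your word processor or publishing system. This is only a sketch: the function name build_alphabet_page is hypothetical, the default of four sets is an arbitrary choice inside the suggested 3 to 8 range, and the pangram used as the test paragraph is a placeholder you would replace with a paragraph that also exercises punctuation.

```python
import string

# The four rows of the letter sequence, copied from the manual.
ALPHABET_ROWS = [
    "` 1 2 3 4 5 6 7 8 9 0 - = \\ ~ ! @ # $ % ^ & * ( ) +",
    "A a B b C c D d E e F f G g H h I i J j K k",
    "L l M m N n O o P p Q q R r S s T t U u V v",
    "W w X x Y y Z z , . < > / ; ' : \" [ ] { }",
]


def build_alphabet_page(sets=4,
                        test_paragraph="The quick brown fox jumps over "
                                       "the lazy dog."):
    """Return page text holding `sets` copies of the sequence plus a
    closing test paragraph, with blank lines between the parts."""
    block = "\n".join(ALPHABET_ROWS)
    parts = [block] * sets
    parts.append(test_paragraph)
    return "\n\n".join(parts) + "\n"


if __name__ == "__main__":
    # Sanity check: every upper and lower case letter is on the page.
    assert set(string.ascii_letters) <= set("".join(ALPHABET_ROWS))
    print(build_alphabet_page())
```

Punctuation stays on the same lines as the letters, as the manual requires, because each row is emitted exactly as listed.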
-
- Point sizes and Scanning DPI
- ----------------------------
- We suggest using 14 point type when printing an Alphabet
- Page using a publishing system and doing your scanning at
- the highest available resolution to get the best possible
- typeface rendering. Since OCRSHARE has OMNISIZE we are
- training on a typeface, not a point size.
-
- Note: You need to press [character] [enter] to enter a
- symbol into OCRSHARE's memory. If [enter] alone is pressed,
- no training for the current symbol is accomplished.
-
- Note: Should you inadvertently strike the wrong key(s), you
- may untrain the mistrained symbol. See Untrain Symbol
- described later for details.
-
- About Autoskip and Manual Training Modes
- ----------------------------------------
- As shipped, OCRSHARE comes up in Manual Training mode. This
- means that it stops on every symbol encountered. If you
- switch to AutoSkip training mode OCRSHARE will, when at
- least the minimum number of symbols have been trained, stop
- only on symbols which do not meet minimum acceptable
- criteria for matching symbols currently in the OCR font data
- base being trained.
-
- More on Autoskip Training
- -------------------------
- Autoskip means that OCRSHARE will automatically skip over
- any character during training that meets the following
- default criteria. The menu fields within the parentheses
- below are found in the Convert Settings Control Panel menu,
- which can be accessed by pressing [CTRL-C].
-
- - The training count for a character is at least 2 (per
- Min Training Count)
-
- - The confidence level is between 0-40 (per Confidence
- Threshold)
-
- - The confidence level of the second choice is at least
- 15 points more than that of the first choice (per
- Similarity Threshold)
-
- Autoskip will allow you to train a new font quickly and
- accurately by ignoring known characters and concentrating on
- unknown or unclear characters.
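
The default Autoskip criteria can be read as a single yes/no test, sketched here in Python. This is an interpretation, not OCRSHARE's actual logic: it assumes that all three criteria must hold together, that lower confidence numbers mean better matches (as the training screen description states), and it leaves out Max Training Count. The function and parameter names are hypothetical.

```python
def should_autoskip(train_count, best_score, second_score,
                    min_training_count=2,
                    confidence_threshold=40,
                    similarity_threshold=15):
    """Return True if a boxed symbol would be skipped under the
    default criteria. Lower scores mean better matches (assumption
    taken from the manual's description of the Confidence field)."""
    trained_enough = train_count >= min_training_count
    confident = 0 <= best_score <= confidence_threshold
    # The runner-up must be clearly worse than the best choice.
    distinct = (second_score - best_score) >= similarity_threshold
    return trained_enough and confident and distinct


if __name__ == "__main__":
    print(should_autoskip(train_count=3, best_score=10, second_score=40))
    print(should_autoskip(train_count=3, best_score=10, second_score=20))
```

Under this reading, a Confidence Threshold of 0 skips only perfect matches, and a Similarity Threshold of 0 never blocks a skip, which agrees with the control panel descriptions later in this manual.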
-
- About the Interactive Training screen.
- --------------------------------------
- Understanding this information is very important to properly
- train a font. The categories are as follows:
-
- Train Count: Indicates the number of times OCRSHARE has
- been trained on the particular character it is
- displaying in its Closest Choice List. You rarely need
- to train OCRSHARE on any single character more than 3
- times.
-
- Confidence: OCRSHARE is numerically telling you how
- confident it is that the Closest Choice List characters
- match the character it has drawn a box around at the
- top of the screen. A good match falls in the range
- between 0-40. A number higher than 100 indicates that
- OCRSHARE has no confidence it recognizes the boxed
- character at the top of the screen properly.
-
- In Autoskip training mode, as the training progresses,
- OCRSHARE will start to see the characters for a second time,
- and the confidence rises (numbers start to drop
- dramatically) as the correct closest choices are displayed.
-
- Font Name: Indicates what font in memory OCRSHARE has
- chosen its closest choice list characters from. This
- indicator is only important when more than one font has
- been loaded in memory.
-
- Closest Choice List: Indicates the best matches OCRSHARE
- has for the boxed character at the top of the screen.
- The Closest Choice characters always appear in
- descending order with the best choice at the top of the
- list, the fourth best at the bottom of the list.
-
- Trained Alphabet: The list of characters in this box
- indicates which characters have been trained at
- least one time, and which characters OCRSHARE has yet to
- see. Every time you enter a new character into
- OCRSHARE's memory, it will darken that character on the
- Trained Alphabet list. Characters OCRSHARE has yet to
- see will remain halftone. An effectively trained font
- will have at least all upper case letters (A-Z), lower
- case letters (a-z), and punctuation marks darkened.
-
- Translation Progress Window: This window will give you an
- indication as to where the translation is in
- relationship to the scanned page. The Translation
- Progress Window is NOT to scale. Do not be concerned if
- the spacing of characters and words is not exact, or if
- extra carriage returns appear on the screen.
-
- Training Your Own Fonts for Recognition
- ---------------------------------------
- Although we have provided 3 pre-trained fonts, you may find
- that you get best results when you train your own pages for
- recognition. This is because there are so many fonts in
- existence and therefore we cannot train them all.
-
- Using Your New OCR Font
- -----------------------
- You can use NEWFONT alone or in conjunction with other
- existing fonts simply by loading it into memory using the
- "Font Settings..." function menu.
-
- Multi font Capability
- ---------------------
- It is important to note that OCRSHARE can utilize more than
- one font at a time. Multiple font capability will enable you
- to properly translate a very broad range of printed material
- containing mixed fonts on the same page.
-
- Note: Very small type or low DPI scanning resolution may
- require you to train special fonts as the height of the
- characters may approach the lower mathematical limits of
- OCRSHARE's OMNISIZE capabilities.
-
- About the Convert Settings Control Panel
- ----------------------------------------
- The options found on the Convert Settings Control Panel
- allow you to modify the operation of OCRSHARE's OCR engine
- and identically affect both training a font and translating
- the character images to text. Normally you will not have to
- change any of these settings.
-
- Zone - This parameter determines which area of the image
- will be trained, or converted to text.
-
-
- "Whole page" will cause the training or conversion to
- start at the left topmost character on the page
- and will proceed to the right bottom character.
-
- "Select area" will cause training to begin with the
- top left most character inside a box drawn by
- Select Area and will proceed to the right bottom
- character in the capture box.
-
- "Page mask" behaves in the same way as Select Area,
- except that it uses a Select Area which remains
- fixed, page after page. This is very useful when
- converting text from the same zone on many pages.
-
- Training Mode - Allows you to select one of two modes for
- training a font, Manual Training and Autoskip training.
- Autoskip will be the mode that is used most often once
- you gain familiarity with OCRSHARE.
-
- "Autoskip" Autoskip training proceeds at an
- accelerated pace, because as OCRSHARE becomes more
- confident of the symbols you are teaching it, it will
- stop to train only characters in which it has
- lower confidence.
-
- "Train OCR" Manual training stops on every symbol,
- regardless of the confidence level. This allows
- for training of characters up to 128 times each
- (optimal training is 3-8 times per character).
-
-
- View Translation - Translation Monitor ON allows you to view
- the conversion to text. Translation Monitor off causes
- a faster translation to occur since the text does not
- have to be displayed. NOTE: The actual text file is
- created at the very end of the viewing stage, when the
- message Formatting Output Page is displayed.
-
- Set Max Symbol Size - Allows you to determine the largest
- character or graphic that OCRSHARE will recognize
- during training and translation. When this function is
- selected, the dimensions of the capture box (drawn with
- Select Area) will automatically be placed in the Edit Max
- Symbol Size setting. Any graphic larger than this box
- will be ignored. This is extremely handy for processing
- pages containing both text and graphics, since the
- graphics will be ignored during OCR.
-
- Set Page Mask Bounds - This will determine the area of the
- page mask, or the area that will be recognized during
- training and translation. When this function is
- selected, the dimensions of the capture box (drawn with
- Select Area) will automatically be placed in Edit Page
- Mask Bounds. Once the Zone is set to Page Mask,
- OCRSHARE will deal only with images inside this
- area. Furthermore, only this area will be active from
- page to page while Page mask is selected.
-
- Set Spot Filter Size - Determines the size below which a
- graphic is ignored during training and translation.
- For example, with a setting of 15, any graphic less
- than 15 pixels in size will be ignored. The shape of
- the graphic does not affect the filter in any way.
-
- Edit Max Symbol Size - This feature allows you to edit the
- symbol size by entering the number of pixels you wish
- the length and width of the symbols to be. However,
- this function will rarely be used since the dimensions
- shown here are usually set by Set Max Symbol Size.
-
- Edit Page Mask Bounds - This feature allows you to edit the
- page mask by entering the number of pixels for each
- edge of the capture area. However, this function will
- rarely be used since the dimensions shown here are
- usually set by Set Page Mask Bounds.
-
- Min Training Count - This number, between 2 and 9 inclusive,
- determines the minimum number of times each character
- will be trained before OCRSHARE will consider skipping
- over it during training.
-
- Max Training Count - This number, between 2 and 9 inclusive,
- determines the maximum number of times any character is
- trained when Training Mode is set to autoskip.
-
- Confidence Threshold - This number only has effect in the
- Manual Train, Autoskip mode and does not usually need
- to be changed. OCRSHARE's certainty in identifying
- characters depends on a number of parameters, including
- the quality of the scan, the consistency and quality of
- the type and how well the font is trained. Changing the
- Confidence Threshold number determines how sure we want
- OCRSHARE to be in identifying characters before
- autoskipping to the next character. Setting the
- Confidence Threshold to 0, however, means that OCRSHARE
- will NEVER autoskip except for PERFECT matches. A
- Confidence Threshold of 40 represents allowing about
- 15% error before a good match is failed during
- training.
-
- Similarity Threshold - This number only has effect in the
- Manual Train, Autoskip mode and does not usually need
- to be changed. Changing the Similarity Threshold
- determines how close the first two choices of symbols
- must be before OCRSHARE will stop and ask for
- confirmation. OCRSHARE wants to have confidence that it
- is correctly differentiating its choices before it
- skips to the next letter. Setting the Similarity
- Threshold to 0 effectively turns the detector off.
-
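
The two size filters in the control panel above (Set Max Symbol Size and Set Spot Filter Size) can be pictured as one test applied to each candidate symbol. The Python sketch below is an illustration under stated assumptions, not OCRSHARE's code: it assumes the spot filter counts black pixels (the manual says shape does not matter), that the maximum symbol size is a width and height in pixels, and that a symbol is kept only if it passes both checks. The function and parameter names are hypothetical.

```python
def keep_symbol(width, height, pixel_count,
                max_width, max_height, spot_filter_size=15):
    """Return True if a candidate symbol would be passed on to
    training or translation under the assumed size filters."""
    if pixel_count < spot_filter_size:
        return False  # too small: treated as a dirt spot
    if width > max_width or height > max_height:
        return False  # too large: treated as a picture or graphic
    return True


if __name__ == "__main__":
    # A 20x30 letter with 180 black pixels, with a 60x60 size limit.
    print(keep_symbol(20, 30, 180, max_width=60, max_height=60))
    # A 3x3 speck with 7 black pixels.
    print(keep_symbol(3, 3, 7, max_width=60, max_height=60))
```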
- Handling Special OCR Problems
- =============================
-
- Broken Letters
- --------------
- You can lose proper symbol capture if letters are broken in
- such a way that a clear vertical white space is visible.
- This is usually caused by a light ribbon or an nth order
- photocopy. First, try rescanning using your utility
- software with a darker contrast setting. Second, reprint
- the page with a new ribbon. Third, use manual training and
- skip over the broken letters.
-
- Broken Dot Matrix
- -----------------
- See Broken Letters (above).
-
- Run together Letters
- --------------------
- OCRSHARE can separate letters which overlap if there is at
- least a thin white space snaking down between them.
- OCRSHARE will, however, clump multiple symbols together
- into a single symbol if they physically touch or run
- together in any manner, no matter how thin the touch point.
- First, try scanning with a lighter contrast setting or a
- higher Dpi resolution. Second, reprint the script training
- page with a space between each letter. Third, use the Erase
- Inside function with a one pixel wide box to erase a thin
- line between the joined letters. Fourth, use manual training
- and either skip over the run together letters or train them
- as letter pairs.
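-
- The behavior described above is what one would expect from
- grouping touching black pixels into symbols. The Python
- sketch below shows that idea on a tiny bitmap. It is an
- illustration only: the function name, the '#' and '.' pixel
- notation, and the rule that diagonal neighbours count as
- touching are assumptions, not OCRSHARE's documented method.
-
```python
def count_symbols(bitmap):
    """Count groups of touching black pixels ('#') in a list of strings.

    Assumption: pixels touch if they are neighbours horizontally,
    vertically or diagonally. All rows must have the same length.
    """
    rows, cols = len(bitmap), len(bitmap[0])
    seen = set()
    symbols = 0
    for r in range(rows):
        for c in range(cols):
            if bitmap[r][c] != "#" or (r, c) in seen:
                continue
            # A new, unvisited black pixel starts a new symbol.
            symbols += 1
            stack = [(r, c)]
            seen.add((r, c))
            while stack:
                y, x = stack.pop()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and bitmap[ny][nx] == "#"
                                and (ny, nx) not in seen):
                            seen.add((ny, nx))
                            stack.append((ny, nx))
    return symbols


# One shape overhangs the other, but white space snakes between them.
overlapping = [
    "#####..",
    "#......",
    "#..###.",
    "#..###.",
]
# The same shapes joined by a single pixel.
touching = [
    "#####..",
    "#..#...",
    "#..###.",
    "#..###.",
]
```
-
- Here count_symbols(overlapping) finds two symbols, while the
- single joining pixel in the second bitmap makes
- count_symbols(touching) report one. Erasing that pixel, as
- with Erase Inside, separates them again.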
-
- Big Dirt Spots
- ---------------
- Dirt spots larger than the minimum spot size cause problems
- because these spots might be treated as valid symbols.
- First, try cleaning the scanner glass. Second, inspect
- your printed training page and use white out to cover any
- spots you may find (these are usually flecks in the paper
- itself). Third, try scanning with a lighter contrast
- setting to reduce sensitivity to spotting. Finally, use the
- Erase Inside and Erase Outside functions to eliminate any
- problem areas.
-
- Underscores
- -----------
- On some printers the underscore symbol is so far under the
- normal baseline that it often becomes split off into a
- separate line. If this is occurring, it may cause an
- occasional spurious symbol to appear on a line by itself.
-
- The % Symbol
- ------------
- In some fonts, OCRSHARE will (sometimes) separate the %
- symbol into a lower case o followed by a /o. If this is
- occurring, you may have to train using the Manual training
- modes. We suggest tagging the lower case o as %o and the /o
- as %/o, then using the font editor to redefine %o as the
- ignore symbol (~~) and %/o as the % symbol.
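-
- The effect of such redefinitions on the output text can be
- pictured with the short Python sketch below. The table
- layout and the emit function are hypothetical; only the
- tags %o and %/o and the ignore marker ~~ come from this
- manual.
-
```python
IGNORE = "~~"  # the manual's ignore marker

# Training tag -> redefined output, as suggested for the % symbol.
redefinitions = {
    "%o": IGNORE,   # first piece: recognized, but nothing is output
    "%/o": "%",     # second piece: stands for the whole symbol
}


def emit(tags, table):
    """Turn a sequence of recognized training tags into output text."""
    out = []
    for tag in tags:
        text = table.get(tag, tag)  # tags without a redefinition pass through
        if text != IGNORE:
            out.append(text)
    return "".join(out)
```
-
- For example, emit(["5", "0", "%o", "%/o"], redefinitions)
- produces the text 50%.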
-
- Italics
- -------
- For reasons similar to the % symbol, in some italic fonts,
- OCRSHARE separates the dots above i's and j's and the dots
- below ?'s and !'s from the rest of the symbol. If this is
- occurring, you should train the page using a manual training
- mode. We suggest tagging and redefinition in a manner
- similar to that used for the % symbol above.
-
- Advanced Training Methods
- =========================
-
- While there are many printed symbols which do not appear on
- your keyboard, you will quite often find that you want to
- recognize them and translate them into something sensible in
- your output text file.
-
- Training Foreign Characters
- ---------------------------
- Foreign alphabets have accented symbols which are not found
- in the English alphabet and, as a consequence, do not occur
- on US keyboards.
-
- Handling foreign letters and special symbols requires two
- steps.
-
- - Train the foreign font database, coding the special
- characters as described.
-
- - Edit the font database, redefining all of the specially
- coded characters so that it will, when performing OCR,
- form the correct letters and output them to your output
- text file.
-
- Training from FRENCH.ATX.
- ------------------------
- 1. Load the FRENCH.ATX image file.
- 2. Use Font Settings CTRL-F to define a new font called
- FRENCH.FTF.
- 3. Select Train Font F7, or press F7.
- 4. Train normal letters with single keystrokes. When you
- encounter a foreign language symbol, refer to the
- next section to learn how to type in a two-character
- sequence for the symbol. Continue until you have
- completed training the page. Note that the double
- letter symbols are NOT displayed in the alphabet
- window.
-
- Training Foreign Language Letters
- ---------------------------------
- As you train, you will need to name the letters of the
- various foreign alphabets which may not appear on your
- keyboard. When this occurs, you will have two choices. The
- first is to input the proper ASCII extended character
- number by depressing the ALT key and typing the number on
- your number key pad. A chart with all of the appropriate
- ASCII extended character numbers has been supplied in the
- Trouble Shooting section of this manual.
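-
- If the chart is not at hand, the numbers can be looked up
- with a few lines of Python. This sketch assumes that the
- "ASCII extended" characters are those of IBM PC code page
- 437, the usual character set on US machines; other code
- pages assign different numbers. The function names are made
- up for the example.
-
```python
def alt_code_character(number):
    """Return the character for an extended code, assuming code page 437."""
    if not 128 <= number <= 255:
        raise ValueError("extended character numbers run from 128 to 255")
    return bytes([number]).decode("cp437")


def alt_code_for(character):
    """Return the number to type on the keypad for one character."""
    return character.encode("cp437")[0]
```
-
- Under that assumption, alt_code_for("é") gives 130, so the
- forward accented e is entered as ALT plus 130 on the number
- key pad.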
-
- There is an alternative method. You can make up an
- equivalent training name for each of these unusual symbols.
- The following examples show some of the cases we have
- encountered and the practical coding rules we made up to
- train these foreign letters on an English keyboard.
-
- Font Editing for correct OCR output
- -----------------------------------
- When you are finished training from FRENCH.ATX, you need to
- edit how the font outputs the foreign language symbols to
- the text file. Select Font Settings... CTRL-F from the OCR
- main menu.
-
- 1. Select the "Font Settings..." sub menu.
- 2. Move the cursor bar down to FRENCH.FTF.
- 3. Move the cursor bar down to Edit and press [enter].
- 4. Press the [end] key.
- 5. Move the cursor bar to the first multi letter symbol you
- want to redefine (e') and select it by pressing
- [enter].
- 6. Select Redefine from List.
- 7. Move the cursor down the list with the down arrow key
- until it rests on the forward accented e symbol. Press
- [enter] to select this e'.
- 8. Repeat steps 5, 6, and 7 for each of our specially coded
- French letters.
- 9. Press [esc] when you are finished to get back to the
- "Font Settings..." menu.
- 10. Save your newly modified font.
-
- Using your new French OCR font
- ------------------------------
- Provided you have captured all of the letters, numbers and
- punctuation of the typeface, you are now ready to use the
- OCR function (F4) to convert French documents printed in
- that typeface. Let's try it now on the same FRENCH.ATX that
- we have loaded. Set an output file using Text Settings...
- CTRL-T. Press F4 to activate the conversion.
-
-
- Training Special Symbols
- ========================
-
- OCRSHARE's ability to recognize special symbols can be quite
- handy when you intelligently redefine them for your
- particular needs. In this section you will learn how we code
- and redefine a few special symbols in ways that generate
- useful output. This section is intended to be used as an
- example as, in reality, there are thousands of potential
- uses for this technique.
-
- Copyright symbol
-
- Code as (c)
-
- Bullets
-
- We suggest that you code square bullets as "[-]"
- and round bullets as "(-)". For example, in Ventura
- Publisher text we use the font editor to redefine both
- of our training keystrokes "(-)" and "[-]" to the same
- string "@BULLET = ".
-
- The Hand Symbol.
-
- If you want to train OCRSHARE to recognize the Hand
- symbol, we suggest that you code this as "!H!". For
- Ventura Publisher we redefine "!H!" to "@NOTE =".
-
- OCR Tuneup Fonts
- ================
-
- Occasionally you will use an existing font which does not
- give you a good translation. Often the reason is a variation
- in the design of the printed typeface. For example, there
- are over 25 significant variations of the Courier typewriter
- font. Other frequent causes are letter distortions caused by
- poor quality printing or nth order photocopying. Another
- frequent cause is the accidental ligatures generated by
- either scanning at too low a resolution or by the original
- typesetter setting the letters so closely that they
- physically touch. In either case this causes a new,
- untrained, symbol to be defined.
-
- The easiest way to handle this is to train a tune-up OCR
- font which will augment your regular font. A tune-up font is
- loaded in addition to your base font to supply the
- additional trained symbols you need for this particular
- document.
-
- Some tune-up fonts you create will be temporary, meant only
- to aid the translation of a specific problem document. These
- can be discarded after you use them. Other tune-up fonts
- will be variations of a base OCR font. These you should keep
- to load again with your base font.
-
- Creating a tune up font.
- ------------------------
- Note: You can press the ESC key at any time to back out of
- these menus without doing anything.
-
-
- 1. Select the Font Settings...CTRL-F menu.
- 2. Select Load Existing Font. Select and load your base
- font, e.g., TIMES.FTF.
- 3. Select Create New Font and enter TIMES-HP. This will
- create the tune up OCR font TIMES-HP.FTF.
- 4. Load or scan the page containing the font to be tuned.
- 5. Interactively train the whole page or a selected area
- of the page where the problem symbols are. (OCRSHARE
- will only stop on symbols which are problematical in
- BOTH the base and the tuneup OCR databases.) Note: type
- in multi character strings when OCRSHARE stops on
- run together characters.
- 6. When completed, save the tune-up font in the normal
- manner.
- 7. To use the tune up later, load the base font AND the
- tune-up font and proceed normally with your OCR
- translation.
-
- Derivative Fonts
- ================
-
- Occasionally you will use an existing font which does not
- give you a good translation. When the primary reason is a
- variation in the design of the printed typeface, as an
- alternative to tuning up a font, you may want to either
- correct the existing font or create a derivative font. In
- either case the procedure is identical.
-
- Testing an Existing Font.
- -------------------------
- 1. Set up the Convert Settings control panel for normal
- interactive training.
- 2. Load the OCR base font which seems to be giving
- problems.
- 3. Load or scan the sample page which is giving problems
- when using the specified base font.
- 4. Perform a test translation. Write down the symbols
- which are not being recognized properly.
-
- Editing the font.
- ----------------
- 5. You now need to untrain each of the problem characters
- in the OCR font. Bring up the Font Editor by selecting
- Font Settings...CTRL-F, then highlighting the font and
- pressing [enter], and selecting Edit.
- 6. Use the down arrow key to move to each problem
- character noted in the test translation.
- 7. Select it.
- 8. Select Untrain Symbol. The definition of that symbol
- will be deleted from that font's database in memory. It
- is now as if the symbol was never trained in the first
- place. The font database on disk has not been modified
- because we haven't saved the one in memory yet.
- 9. Repeat steps 6, 7 and 8 for each problem character.
-
- Retraining your font.
- --------------------
- 10. Using the newly edited font AND the same image,
- activate Train Font F7. OCRSHARE will stop on each of
- the symbols you've untrained (and occasionally some of
- the other normal letters) so that you can retrain, or
- enhance the training of already-trained symbols. This
- should go very quickly.
- 11. Once you've finished training the font, you should save
- it. If you save the font under a different name, you
- have created a derivative font. If you save the font
- under its original name, you have corrected the
- existing font.
-
- Trouble Shooting
- ================
-
- The following is a checklist of items to verify before running
- OCRSHARE.
-
- I installed OCRSHARE according to the instructions, but when I
- started it, I did not get the menu display.
-
- a) The first thing to try is to delete the OCRSHARE.CFG
- file if it exists, then restart OCRSHARE. If the graphics
- display adaptor shown when you restart is not the one
- you have installed, then you will have to override
- OCRSHARE's selection.
-
- b) If the above fails, check to see if the monitor and
- display card you have is really a graphics display. If
- it is, make sure that you are selecting the correct
- override switch for the graphics display card you have
- installed.
-
- c) Delete the OCRSHARE.PRO setup file.
-
- The machine locks when I try to run OCRSHARE.
-
- a) Be sure enough memory is available, at least 520K free.
- Use CHKDSK to be sure. Also, release all TSR's
- (terminate and stay resident programs) you may have
- loaded before running OCRSHARE; we need the space.
-
- b) Make sure that a proper screen driver is loaded (see
- Selecting a Screen Driver in Installation).
-
- The machine just jumps back to DOS when I run OCRSHARE.
-
- This may be because you are using the EMS simulator and have
- very little space left on your hard disk. Be sure to make
- 2-4 megabytes available on your hard disk, or run from
- another hard disk or partition with more space on it.
-
- The menu display is up but the keyboard won't work.
-
- Press the NUM LOCK key; the NUM LOCK light on the keyboard
- should always be OFF.
-
- The Mouse Won't Work
-
- OCRSHARE does NOT support a mouse at this time unless your
- mouse is programmable in such a way that it will emulate
- keystrokes such as up arrow, down arrow, enter, Page Down,
- etc. Consult your mouse manual for specific instructions for
- setting up your mouse in this manner.
-
- When saving an image, the hard disk light "churns forever", and
- the on screen status bar barely moves.
-
- There is not enough hard disk space available to save the
- image. Press Esc, and either use the File Manager (INFO
- menu) to delete some files and make more room, or Exit to
- DOS CTRL-X to do the same.
-
- The scanned image is garbage, or no image shows at all.
-
- The most likely problem here is that you are using a scanner
- which has its own image buffer mapped into the address space
- above 640K and another device, such as a graphics display
- card, an EMS card or a network card, which conflicts with it.
- Verify whether an EMS conflict is occurring by selecting
- Load Page F5 and loading the EMS test page. If the cross
- hair pattern is not clean or has the same garbage as the
- scanned image, there is an EMS conflict.
-
- Resolve the address conflict by changing the address of
- the scanner card or the address of the conflicting card.
- Make sure that you edit the CONFIG.SYS file to tell the
- drivers about the new address.
-
- Finally, make sure that the scanner is clean. You can
- easily check to see if the unit is clean by scanning a pure
- white page. If an image with many spots results, clean the
- glass.
-
- The Quick Start procedure shows nothing/garbage in the
- Translation Window.
-
- Your scanner is probably not working properly. Press Scan
- Page F3 to scan the WELCOME TO OCRSHARE page. An image of
- the page should show up on your monitor in black characters
- on white background. If not, walk through the Scanner
- Problems, and Scanning sections earlier in Trouble shooting.
-
- When training, nothing happens.
-
- If nothing happens, and you are "kicked" back to the main
- menus, make sure that the loaded image consists of dark
- characters on light background. If not you must invert the
- page using "Invert Page" off the EDIT menu.
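-
- The dark-on-light requirement can be pictured with the
- small Python sketch below, which guesses the polarity of a
- one-bit image and inverts it if needed. The rule that a
- normal text page is less than half ink is an assumption for
- illustration, as are the function names and the use of 1
- for a black pixel.
-
```python
def is_dark_on_light(bitmap):
    """Guess whether a 1-bit page is dark text on a light background.

    bitmap is a list of rows of 0 (white) and 1 (black) pixels.
    Assumption: on a normal text page fewer than half the pixels are ink.
    """
    ink = sum(row.count(1) for row in bitmap)
    total = sum(len(row) for row in bitmap)
    return ink * 2 < total


def invert_page(bitmap):
    """Swap black and white pixels, returning a new bitmap."""
    return [[1 - pixel for pixel in row] for row in bitmap]
```
-
- A page that fails the check would be passed through
- invert_page before training, which mirrors what Invert Page
- does from the EDIT menu.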
-
- OCRSHARE is skipping characters that it has never seen before.
-
- This is probably because the untrained character being
- skipped is too much like another that has been trained.
-
- a) Press Esc to exit the training screen.
-
- b) Select Convert Settings... CTRL-C and set the Training
- Mode to Train OCR manual.
-
- c) Use Select Area F9 to capture the word(s) in which the
- character(s) being skipped are located.
-
- d) Select Train Font F7 and train the character(s) that
- were being skipped.
-
- e) When finished, you may reset the Training mode to
- Train OCR, autoskip.
-
- I keep seeing "Out of Memory at row XXX. Please set narrower
- window".
-
- If your page is very complex, you may need to select an
- area of text to train/convert on, rather than the whole
- page. More often than not, the cause is skewed, or sloping,
- lines of text. Use Straighten Page on the GRAPHICS menu to
- horizontally align the image. Also, perform a CHKDSK from
- DOS to verify that at least 520K is free before running
- OCRSHARE. Be sure to release any TSR's before running
- OCRSHARE.
-
- How do I know which font to load when converting to text?
-
- A font guide comes with the registered version of OCRSHARE.
- If you have this guide, compare the typestyles on your page
- to those of the samples and load the one(s) that are the
- most similar. If you are not sure which font is most like
- yours, you can load up to five fonts that are most like the
- one on your page.
-
- There are too many/too few line feeds and/or spaces in the text
- file.
-
- a) View the text (.TXT) file from DOS using the TYPE
- command, or use your favorite word processor. The
- Translation Window in OCRSHARE is not to scale, meaning
- not all spacing is shown correctly.
-
- b) Remember to set Proportional line Spacing if the type
- on the page is proportionally spaced, and to set Fixed
- line spacing if the type is fixed-spaced or typewritten
- material.
-
- How do I skip a character during training?
-
- Press [enter] only without naming the symbol. A symbol is
- trained only when character(s) are entered and return is
- pressed.
-
- What happens if I try to train a font when more than one font is
- loaded?
-
- OCRSHARE will show you the "best match" for the current
- character based on all the loaded fonts, even though only
- the font set to "Train" is being modified. Unless you are
- training an existing font as a derivative font (see Lesson
- 5), you should remove all fonts other than the one being
- trained for best results. Then simply reload all the desired
- fonts for conversion to text after training is finished.
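-
- A "best match" over several loaded fonts can be sketched in
- Python as below. The font file names are the examples used
- in this manual, but the 3 x 3 templates, the pixel-count
- mismatch score and the function names are invented for
- illustration and say nothing about OCRSHARE's internals.
-
```python
def mismatch(a, b):
    """Fraction of differing pixels between two equal-length bitmaps."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def best_match(pattern, fonts):
    """Return (symbol, font_name) of the closest template in any loaded font."""
    candidates = [
        (mismatch(pattern, template), symbol, name)
        for name, font in fonts.items()
        for symbol, template in font.items()
    ]
    score, symbol, name = min(candidates)
    return symbol, name


# Hypothetical 3 x 3 templates written row by row as strings of 0 and 1.
fonts = {
    "TIMES.FTF": {"l": "010010010", "o": "111101111"},
    "TIMES-HP.FTF": {"l": "110010010"},
}
```
-
- With these made-up templates, best_match("110010011", fonts)
- picks the "l" from TIMES-HP.FTF, because that template
- differs from the scanned pattern in only one pixel.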
-
- When F7 (Train Font) is selected, nothing happens and the
- program returns to the menus.
-
- a) Make sure that the image is dark characters on light
- background. If not, change the Image Background
- (Scanner settings... CTRL-S) to the opposite of what is
- currently displayed, and re-scan the page.
- b) Make sure that Whole page is selected as the Conversion
- zone (in Convert Settings... CTRL-C), or that the
- Select area or Page mask is of sufficient size to
- capture the area of the page you want to train on.
-
- The characters are badly broken or OCRSHARE is stopping on pieces
- of a character during training.
-
- 1. Try increasing the scanner contrast to a higher setting
- (toward Dark) so that the characters darken up and the
- broken pieces become continuous (see Lesson 1).
- 2. Use the following "Tag" coding of the character so as
- to ignore or consolidate parts of the broken symbols.
-
- a. "Tag" the first piece of the character by entering
- a two- or three-character sequence, then press
- return.
- b. Tag the second piece of the character by entering
- the correct symbol. For example, if the character
- is supposed to be an "A", but appears in two
- pieces, identify the first piece as "AA", and the
- second as "A".
- c. When training is completed, prepare to Edit the
- font per the instructions under Editing the Font.
- d. Once the Font list window is displayed, Page Down
- to the end of the table and locate the two- and
- three-character symbols. Highlight the row with
- "AA" and press return.
- e. Select Redefine as Text and enter the ignore
- marker, i.e., two tilde marks (~~).
- f. Repeat steps d and e for each of the remaining
- multi-character tags.
- g. During conversion to text, the first piece will
- now be recognized, but nothing will be output; the
- second piece will be translated to the symbol
- assigned to it.
-
- More than one character is included in the training box on the
- screen.
-
- a) Try setting the scanner resolution to its highest DPI
- setting. This ensures that the scanner has the best
- chance of seeing white space (if any) between adjacent
- characters (see Lesson 2).
- b) If the scanner is already at its highest DPI setting
- and you still have excessive problems with run together
- letters, try decreasing the scanner contrast a little
- bit towards lighter. This ensures that the scanner has
- the best chance of seeing white space (if any) between
- adjacent characters.
- c) Some characters may actually be run together. Train
- these groups of run together characters as
- multi-letter (ligatured) symbols. You will usually find
- that the typesetter consistently ran certain letter pairs
- together. As long as the sequence is 40 characters or
- less, the corresponding sequence can be entered. For
- example, if the sequence "iff" should appear in the
- training window instead of "i", "f", and "f", enter
- "iff" from the keyboard, then press return.
-
- The dots on the i's and j's are not included in the box during
- training.
-
- Try setting the Spot Filter (in Convert Settings... CTRL-C)
- from 15 to 8 dots. OCRSHARE is not finding the dot since its
- size is smaller than the one specified in the Spot Filter
- setting.
-
- More than one character, vertically, or more than one full line,
- are being displayed during training.
-
- a) Occasionally, more than one line of text will be so
- closely spaced that descending characters (e.g. "y")
- will hang into the next line of text. In these
- instances, OCRSHARE cannot determine where one
- character ends and the other begins, so it will capture
- both as one character. Do not train on any such symbol,
- just press return only.
-
- b) Another possibility is that two lines have some dirt,
- malformed or particularly large characters invading the
- interline space which causes a lack of line separation.
- The trick here is to provide OCRSHARE with just enough
- space between the lines of text so that it can
- distinguish among them. A very simple method for doing
- this is to draw a box (using Select Area, F9) between
- the two lines whose dimensions are the length of the
- line horizontally, and about 1/32 to 1/16 inch in
- height (a very short rectangular box is needed). Once
- the box is drawn, select Erase Inside (GRAPHICS menu)
- and all marks within this box causing poor line
- separation will be removed.
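-
- Why a thin erased box restores line separation can be seen
- in the Python sketch below. It assumes, purely for
- illustration, that text lines are found by looking for rows
- with no ink at all; the function names and the '#' and '.'
- pixel notation are not taken from OCRSHARE.
-
```python
def find_text_lines(bitmap):
    """Return (top, bottom) row ranges separated by fully white rows.

    bitmap is a list of strings in which '#' is ink. Assumption: a text
    line ends wherever a completely white row follows it.
    """
    lines = []
    start = None
    for y, row in enumerate(bitmap):
        has_ink = "#" in row
        if has_ink and start is None:
            start = y
        elif not has_ink and start is not None:
            lines.append((start, y - 1))
            start = None
    if start is not None:
        lines.append((start, len(bitmap) - 1))
    return lines


def erase_inside(bitmap, top, bottom, left, right):
    """Blank every pixel inside the box (bounds are inclusive)."""
    cleaned = []
    for y, row in enumerate(bitmap):
        if top <= y <= bottom:
            row = row[:left] + "." * (right - left + 1) + row[right + 1:]
        cleaned.append(row)
    return cleaned


page = [
    "##.##.##",
    "##.##.##",
    "...#....",  # stray mark bridging the interline space
    "##.##.##",
    "##.##.##",
]
```
-
- On this sample, find_text_lines(page) reports a single line
- covering rows 0 to 4. After erase_inside(page, 2, 2, 0, 7)
- clears the one-row box, two separate lines are found.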
-
- Some lines of text are accurately translated, yet others are
- completely mis-translated; or, characters from two consecutive
- lines are appearing in the training window.
-
- a) This usually happens when OCRSHARE is having trouble
- with line separation. That is, OCRSHARE cannot
- distinguish the bottom of one line from the top of the
- next. Check the physical page to make sure there are no
- stray marks between the lines that are displayed
- simultaneously. Also check for any underlined words.
- Try erasing or masking these marks and re-scan the
- page.
-
- b) Make sure the image is straight. If it is not, re-scan
- the page, or Straighten the image (see Lesson 1).
-
- The larger characters, and sometimes normal-sized characters, are
- being ignored during training and/or translation.
-
- Remember that any symbol larger than the Max Symbol Size
- setting (Convert Settings...) will be ignored. This will
- include any large single characters and any multi-character
- symbols (e.g. ARL) whose combined size is larger than the
- specified symbol size. Even though the characters A,R,L are
- normal-sized, their being so close makes them appear as one
- character to OCRSHARE.
-
- Can OCRSHARE handle columnar material?
-
- Yes. It is best to convert columns to text one at a time by
- using Select area and drawing a box around each column one
- at a time.
-
- What is the best way to OCR a complex page, i.e. one with both
- pictures and text on it?
-
- There are actually three ways a complex page can be
- processed:
-
- a) Perhaps the best way is to do nothing, as OCRSHARE will
- automatically ignore graphics (pictures) as it performs
- OCR on the page.
-
- b) Select Convert to Layers (GRAPHICS menu). This feature
- will separate the image into three files. The first,
- LARGE.RAS will contain all images (usually pictures)
- greater than the Max Symbol Size (Convert Settings...).
- MEDIUM.RAS will contain all the images (usually text
- images) greater than the spot filter setting and less
- than or equal to the Max Symbol Size. SMALL.RAS will
- contain most of the stray marks and spots on the page.
- Converting to layers is a convenient way of
- automatically separating the page into its various
- components.
-
- c) Use Erase Inside on picture portions of the page. If
- you do not wish to keep the pictures, you may decide to
- remove them entirely from the image so that you can see
- what you are doing with the text more readily.
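-
- The size rules given for Convert to Layers can be written
- out as the Python sketch below. It assumes that a shape's
- size, the spot filter and the Max Symbol Size are all
- measured in the same unit (dots). The function name and the
- value 200 are made up; 15 is simply the spot filter value
- mentioned elsewhere in this section.
-
```python
def layer_for(size, spot_filter=15, max_symbol_size=200):
    """Pick the layer file for one shape of the given size.

    Assumption: size, spot_filter and max_symbol_size share one unit.
    """
    if size > max_symbol_size:
        return "LARGE.RAS"    # usually pictures
    if size > spot_filter:
        return "MEDIUM.RAS"   # usually text
    return "SMALL.RAS"        # stray marks and spots
```
-
- The same comparison explains the missing dots on i's and
- j's: a 10 dot mark falls into SMALL.RAS with a spot filter
- of 15, but counts as text once the filter is lowered to 8.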
-
- Can OCRSHARE handle landscape-oriented material?
-
- At this time, OCRSHARE does not have a 180-degree rotate
- function, but one will be added in the future.
-
- Can OCRSHARE run under Windows?
-
- Yes, conditionally. OCRSHARE is NOT a Microsoft Windows
- compatible program. Either run OCRSHARE without a .PIF file
- and use all of Windows' defaults, or set up a .PIF file which
- gives OCRSHARE the MAXIMUM amount of memory and indicates
- that it directly modifies the screen and keyboard. Keep in
- mind that OCRSHARE, like Windows, is a large program and
- thus will probably run best as a stand alone process.
-
- Can fonts trained at one resolution be used to translate text
- scanned at another?
-
- Yes, but the accuracy can be lower as compared to the
- accuracy when using a font trained at the same resolution as
- the scanned page.
-
- Which TIFF format does OCRSHARE read and write to?
-
-
- Our TIFF format is essentially the same as the ones
- generated and read by Hewlett-Packard Scan Gallery software
- and has been tested against a number of other software
- packages.
-
-
- How do you train if more than one font is present on a page?
-
- It is best to train all the characters of one font and save
- it to a file, remove this font, and then to train all the
- characters of another font and save this training in another
- file. Try not to mix characters of different fonts during
- training of any given font. Remember that Select Area (F9)
- can be used to capture various words and lines of text to
- augment the training process should characters of one font
- be intermixed with characters of another font.
-
- Can OCRSHARE be trained to read quotation marks?
-
- No. Each piece of the mark must be assigned a ' since
- OCRSHARE captures only one ink pattern at a time.
-
- Can more than one font be trained at the same time?
-
- No. If more than one font is loaded (Font Settings), then
- only one of these can be set to Train mode at any given time;
- the rest are set to Use.
-
- How many bytes of (EMS) memory do scanned images require?
-
-
- It depends on the resolution, but the number can be
- determined by simple calculation. At a resolution of 200
- Dpi, there are 200 bits by 200 bits, or 40,000 bits of
- information, per square inch. Therefore, the formula for
- determining the number of bytes required is (Resolution x
- Resolution x length x width) / 8 bits per byte. On an 8 1/2
- x 11 page, then, there are (200 x 200 x 8 1/2 x 11) / 8 or
- 467,500 bytes of memory required to store this image.
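-
- The same arithmetic can be written as a short Python
- function for other page sizes and resolutions. The function
- name is made up for this example, and it assumes a one bit
- per dot image with no compression, as in the formula above.
-
```python
import math


def image_bytes(dpi, width_in, length_in):
    """Bytes needed for a one-bit-per-dot scan of a page in inches."""
    bits = dpi * dpi * width_in * length_in
    return math.ceil(bits / 8)  # 8 bits per byte, rounded up
```
-
- image_bytes(200, 8.5, 11) gives the 467,500 bytes worked out
- above. The same page at 300 Dpi needs 1,051,875 bytes, so
- memory use grows with the square of the resolution.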
-
- OCRSHARE V2.2 Registration Form
- ===============================
-
- Name: ______________________________________________________
-
- Address: ___________________________________________________
-
- ___________________________________________________
-
- City:_______________________________________________________
-
- State (or Province+Country):________________________________
-
- Zip/Postal Code:_____________
-
- Phone: _______________________
-
-
- **** YOU MUST BE REGISTERED TO QUALIFY FOR UPGRADES ****
-
-
- Qty
-
- ____ Registration(s) of OCRSHARE $45.00 each $__________
-
- ____ OCR Font Disks $25.00 each $__________
-
- ____ Advantex OCR $395.00 each $__________
-
- ATXSHARE/Advantex Media Selection
-
- [ ] 5.25 DSDD (640K)
- [ ] 5.25 DSHD (1.2Meg)
- [ ] 3.5 DSDD (720K)
-
- TOTAL $__________
-
-
- Checks or money orders should be made payable to the
- following (UPS COD available on request):
-
- Solution Technology, Inc
- PO Box 273372
- Boca Raton, Florida 33487
-
- NOTE TO RETAIL DEALERS:
- =======================
-
-
- For more product information on OCRSHARE, ATXSHARE or
- Advantex, Retail Dealers should contact Solution Technology,
- Inc. directly at the address and phone number given on the
- front cover of this manual.
-
- NOTE TO SHAREWARE DEALERS:
- ==========================
-
- Shareware Dealers must contact STI directly to apply for
- distribution. Any request for shareware catalog inclusion
- requires that you send a current copy of your product
- catalog with your application, along with a blank diskette
- and mailer. Once you have been approved, you will receive
- written permission from Solution Technology to include
- OCRSHARE in your library for distribution. Additionally,
- you will receive your diskette back containing the latest
- OCRSHARE release. You must, in addition, send each issue of
- your catalog as it is published to update our mailing
- list of active vendors. STI's continued receipt of your
- catalog will enable us to verify both that you have the most
- recent OCRSHARE release (within reason) and that your
- business is still active.
-
- We will send authorized shareware dealers each MAJOR
- revision of OCRSHARE as it is released so that your library
- is kept up to date.